Document Retrieval and Routing Using the INQUERY System

Authors

  • John Broglio
  • James P. Callan
  • W. Bruce Croft
  • Daniel W. Nachbar
Abstract

The INQUERY retrieval and routing system, which is based on the Bayesian inference net retrieval model, has been described in a number of papers [5, 4, 10, 11]. In the TREC experiments this year, a number of new techniques were introduced for both the ad-hoc retrieval and routing runs. In addition, experiments with Spanish retrieval were carried out. For the ad-hoc retrieval experiments, the major changes to the system were the incorporation of passage retrieval, query expansion using PhraseFinder [8], a new estimation technique for indexing probabilities, and new analysis techniques for the TREC topics.

The following description of query processing emphasizes the differences from the approach used in the previous evaluations, which is described in [5, 3]. The text in all parts of the TREC topics (topic, description, narrative) was treated in the same way. The University of Massachusetts JTAG tagger [12] was used to identify parts of speech for words in the query. Then sequences of nouns, or sequences of adjectives and following nouns, were selected for the #PHRASE operator. (In the past, we had included prepositional phrases that followed a "bare" head noun. We eliminated these phrases because we found they were not useful.) Analysis of phrase statistics has shown that phrases longer than two words were invariably too restrictive, especially when subphrases of these would have been useful in retrieval. We therefore used only two-word phrases. Sequences of more than two words were broken down into two-word subsequences. For example: (crude oil price trend) → (crude oil), (oil price), (price trend). Using this modification of the original phrase extraction procedures, we were able to eliminate a number of special processing steps and apply noun-phrase extraction to all sections.
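The phrase selection described in the abstract can be illustrated with a minimal Python sketch. It is not the INQUERY implementation: the tag labels "ADJ" and "NOUN", the function names, and the input format (a list of word/tag pairs) are all assumptions made for illustration, since the abstract does not specify the JTAG tag set or output format. The sketch collects runs of optional adjectives followed by one or more nouns, then breaks each run into overlapping two-word subsequences, reading the abstract's rule literally.

```python
from typing import List, Tuple

# Hypothetical tag labels; the JTAG tagger's actual tag set is not given in the abstract.
ADJ = "ADJ"
NOUN = "NOUN"


def candidate_sequences(tagged: List[Tuple[str, str]]) -> List[List[str]]:
    """Collect runs of zero or more adjectives followed by one or more nouns.

    Only runs of at least two words are kept, since shorter runs cannot
    yield a two-word phrase.
    """
    sequences: List[List[str]] = []
    current: List[str] = []
    seen_noun = False
    for word, tag in tagged:
        if tag == NOUN:
            current.append(word)
            seen_noun = True
        elif tag == ADJ and not seen_noun:
            current.append(word)
        else:
            # Either a non-phrase word, or an adjective that follows a noun:
            # both end the current run.
            if seen_noun and len(current) >= 2:
                sequences.append(current)
            # An adjective may start the next run; anything else resets it.
            current = [word] if tag == ADJ else []
            seen_noun = False
    if seen_noun and len(current) >= 2:
        sequences.append(current)
    return sequences


def two_word_phrases(sequence: List[str]) -> List[Tuple[str, str]]:
    """Break a sequence into overlapping two-word subsequences."""
    return [(sequence[i], sequence[i + 1]) for i in range(len(sequence) - 1)]


def extract_phrases(tagged: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return every two-word phrase found in a tagged query."""
    phrases: List[Tuple[str, str]] = []
    for sequence in candidate_sequences(tagged):
        phrases.extend(two_word_phrases(sequence))
    return phrases


if __name__ == "__main__":
    query = [
        ("the", "DET"), ("crude", "ADJ"), ("oil", "NOUN"),
        ("price", "NOUN"), ("trend", "NOUN"), ("fell", "VERB"),
    ]
    print(extract_phrases(query))
    # [('crude', 'oil'), ('oil', 'price'), ('price', 'trend')]
```

Each resulting pair would then be a candidate argument for the #PHRASE operator. Note that a run such as adjective, adjective, noun also yields the adjective-adjective pair under this literal reading; whether the original system filtered such pairs is not stated in the abstract.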

Similar resources

New Tools and Old Habits: The Interactive Searching Behavior of Expert Online Searchers using INQUERY

We present data that describe the interactive searching behavior of ten searchers using the INQUERY retrieval engine in the context of the TREC routing task. We discuss how these searchers, with a strong background in the use of traditional online retrieval mechanisms, adapted after very limited training to the use of a best-match, ranked-output, full-text retrieval mechanism.

Recent Experiments with INQUERY

Past TREC experiments by the University of Massachusetts have focused primarily on ad-hoc query creation. Substantial effort was directed towards automatically translating TREC topics into queries, using a set of simple heuristics and query expansion. Less emphasis was placed on the routing task, although results were generally good. The Spanish experiments in TREC-3 concentrated on simple index...

TREC and Tipster Experiments with Inquery

INQUERY is a probabilistic information retrieval system based upon a Bayesian inference network model. This paper describes recent improvements to the system as a result of participation in the TIPSTER project and the TREC-2 conference. Improvements include transforming forms-based specifications of information needs into complex structured queries, automatic query expansion, automatic recognitio...

An Evaluation of Query Processing

The TIPSTER collection is unusual because of both its size and detail. In particular, it describes a set of information needs, as opposed to traditional queries. These detailed representations of information need are an opportunity for research on different methods of formulating queries. This paper describes several methods of constructing queries for the INQUERY information retrieval system, a...

TREC-2 Routing and Ad-Hoc Retrieval Evaluation using the INQUERY System

The ARPA TIPSTER project, which is the source of the data and funding for TREC, has involved four sites in the area of text retrieval and routing. The TIPSTER project (which includes MCC as a subcontractor) has focused on the following goals: improving the effectiveness of information retrieval techniques for large, full-text databases; improving the effectiveness of routing techniques appropria...

Publication date: 1994